DOCUMENT RESUME

ED 142 586                                        TM 006 427

AUTHOR        Erlich, Oded; Borich, Gary
TITLE         Generalizability of Teacher Process Behaviors during
              Reading Instruction.
INSTITUTION   Texas Univ., Austin. Research and Development Center
              for Teacher Education.
PUB DATE      15 Oct 76
NOTE          20p.; For related documents, see ED 042 688, 066 438,
              and 127 333
EDRS PRICE    MF-$0.83 HC-$1.67 Plus Postage.
DESCRIPTORS   *Classroom Observation Techniques; Data Analysis;
              Elementary School Students; Feedback; Grade 2; Grade
              3; *Interaction Process Analysis; Participant
              Involvement; Predictor Variables; Primary Education;
              Reading; *Reading Achievement; *Reading Instruction;
              Statistical Analysis; *Student Teacher Relationship;
              *Teacher Behavior; Test Reliability
IDENTIFIERS   *Generalizability Theory; Teacher Child Dyadic
              Interaction System (Brophy)



ABSTRACT 

Maintaining that the generalizability of behavioral 
measures has not been sufficiently established to permit conclusions 
about the relationship between teacher behavior and student 
achievement, the present research examines the generalizability of 
classroom interaction variables measured by the Brophy-Good 
Teacher-Child Dyadic Interaction System during 2nd and 3rd grade 
reading instruction. Using generalizability theory as the statistical
basis for data analysis, the number of measurement occasions required
to reach the 0.7 level of generalizability for five clusters of
classroom interaction variables was identified. Analyses revealed
that the interaction characterizing reading instruction differs from 
that characterizing other kinds of instruction in regard to: 1) 
proportion of public to private teacher-pupil interactions; 2) nature 
of questions asked; and 3) teacher behavior concerning feedback, 
pupil involvement, and question difficulty. (Author) 
The Evaluation of Teaching Project, one of four projects at the
Research and Development Center for Teacher Education, has as its
mission to develop materials and strategies for teacher training,
research and evaluation. The goals of the Evaluation of Teaching
Project are to develop (1) a conceptual framework for the evaluation
of teaching, (2) a sourcebook of validated teacher evaluation and
research instruments, and (3) strategies for the evaluation of teacher
trainee programs. These goals are being carried out with funds from
the National Institute of Education.

In the process of meeting its objectives, the Evaluation of Teaching
Project conducts systematic research in teacher behavior for the
purpose of validating instruments and identifying characteristics of
effective teaching. The following report describes one facet of this
research. A complete listing of studies in this report series is
available by writing to the Evaluation of Teaching Project, R&D Center
for Teacher Education, University of Texas at Austin, 78712.
Abstract

Maintaining that the generalizability of behavioral measures has not
been sufficiently established to permit conclusions about the relationship
between teacher behavior and student achievement, the present research
examines the generalizability of classroom interaction variables measured
by the Brophy-Good Teacher-Child Dyadic Interaction System during 2nd and 3rd
grade reading instruction. Using generalizability theory (Cronbach, Gleser,
Nanda, and Rajaratnam, 1972) as the statistical basis for data analysis,
the number of measurement occasions required to reach the 0.7 level of
generalizability for five clusters of classroom interaction variables
was identified. Analyses revealed that the interaction characterizing
reading instruction differs from that characterizing other kinds of
instruction in regard to: 1) proportion of public to private teacher-pupil
interactions; 2) nature of questions asked; and 3) teacher behavior
concerning feedback, pupil involvement, and question difficulty.
Attempts to find correlations between reading instruction and reading
achievement have previously centered around methods of teaching reading (e.g.,
whole word vs. phonics) (Chall, 1967). While some tentative conclusions have
been drawn about the relative effectiveness of various methods, no one method
has been shown to be unquestionably superior. One important approach for
studying factors related to reading achievement is that of observing operationally
defined variables of teacher behavior and classroom interaction and then relating
them to reading achievement. This approach assumes that pupil-teacher classroom
interactions play a key role in producing pupil learning. By identifying classroom
interactions which increase pupil achievement, researchers can assist
teachers in constructing an empirically validated instructional model for the
teaching of reading.

Results from past correlational studies of teacher behaviors and student
outcomes (including, but not restricted to, reading achievement) have been
disappointing, with most correlations low or nonreplicable (Shavelson and Atwood,
1977). One possible reason for the lack of relationship between classroom
interactions and student achievement is that the generalizability of behavioral
measurements has not been adequately examined or established to allow conclusions
about relationships between teacher behavior and student outcomes to be drawn.
In this paper we will be concerned with the generalizability of classroom
interaction measures during reading instruction.
The Concept of Generalizability

The concept of generalizability is based on the notion that the behavior
observed represents only a sample of the true behavior. If the sample of
observed measurements contains little or no error, the generalization to the
characteristic (true) behavior is sound; the accuracy of the measurement is
high. If the observed scores contain sizable error of measurement, the
generalization to the characteristic behavior is tenuous; the accuracy is low.
Measures of teacher-pupil classroom interaction contain potential sources of
error (facets) such as observation occasion, observers, subject matter, etc.
Only by considering the effect of all these facets can we determine the extent
to which teacher behavior measures are generalizable.

For example, in most studies of teaching process, a random sample of
teachers is observed by two or more raters. The consistency with which the
teachers are rank ordered on some variable such as "number of verbal
reinforcements" or "number of questions the teacher asks" is interpreted as the
reliability of the measurement. Typically each teacher's score is an average
of the raters' scores for that teacher and is usually interpreted as
characteristic of the teacher asking questions or using verbal reinforcements. No
doubt the use of several raters provides a more precise measure for each
teacher, but what about the nature of the pupils taught, the teaching situation,
the subject matter taught, and other factors that might contribute to the
instability of the teachers' behavior? While the measurement is taken in one
particular setting and at one particular point in time, it is usually interpreted
as generalizing over many settings at different points in time.

Only a few studies on the generalizability of teacher behavior measures
have reported on more than one facet. Most have either explained how
to apply generalizability theory to examine the problems in measuring teacher
process variables or they have failed to use appropriately the data available.
(See Erlich, 1976.) Two appropriate generalizability studies recently examined
variables of student-teacher classroom interaction. Erlich and Borich (1976)
analyzed classroom interactions during nonreading class activities in the 2nd
and 3rd grades. Erlich (1976) analyzed 5th grade teacher behaviors occurring
during reading and math combined. Because different subject matters, e.g.,
reading, math, social studies, may elicit different kinds and frequencies of
pupil-teacher classroom interactions, observation data of interactions occurring
during different subject matters may need to be examined separately.

Purpose 

The purpose of this study was to identify teacher-pupil interactions
occurring during beginning reading instruction and to examine the
generalizability of these measures of classroom interaction.

Method 

Sample. The data analyzed in this study were collected during the second
year of a two-year replicated study of teacher effectiveness using the
Brophy-Good Teacher-Child Dyadic Interaction System (Brophy and Evertson, 1976).
Subjects were 26 teachers who had 3 or more years of teaching experience, with
their 3 most recent years of experience at the 2nd or 3rd grade level. These
teachers were selected because they had produced consistent pupil learning on
the Metropolitan Achievement Tests over three consecutive years.[1] Teachers
were observed between three and seven times during reading
instruction by two different raters who alternated across occasions. Four
teachers who had been observed on fewer than five occasions were eliminated
from our analysis. For those teachers who were observed on more than five
occasions, five occasions were selected at random for the analysis. Thus, the
final data analyzed included 22 teachers, each observed on five occasions.

[1] A linear pattern of either gain, constancy, or decline over the three-year
period constituted the definition of consistent pupil learning in this study
(Brophy, 1973).
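
The selection rule in the Sample paragraph (drop teachers with fewer than five occasions, then draw five at random for the rest) can be sketched as follows. This is a minimal illustration and not the authors' procedure: the function name `select_occasions`, the dictionary layout, the toy teacher labels, and the seed are all assumptions introduced here, and the source does not say how the random draw was carried out.

```python
import random


def select_occasions(observations, k=5, seed=None):
    """Keep teachers with at least k occasions; sample exactly k for each.

    observations: dict mapping a teacher label to a list of occasion records
    (hypothetical layout, chosen only for this sketch).
    """
    rng = random.Random(seed)
    selected = {}
    for teacher, occasions in observations.items():
        if len(occasions) < k:
            # Too few occasions: the teacher is excluded from the analysis.
            continue
        if len(occasions) == k:
            selected[teacher] = list(occasions)
        else:
            # More than k occasions: draw k of them without replacement.
            selected[teacher] = rng.sample(list(occasions), k)
    return selected


# Toy data: occasion records are just occasion numbers here.
toy = {
    "T1": [1, 2, 3],              # dropped (fewer than five)
    "T2": [1, 2, 3, 4, 5],        # kept as is
    "T3": [1, 2, 3, 4, 5, 6, 7],  # five occasions sampled at random
}
balanced = select_occasions(toy, k=5, seed=0)
print({teacher: len(occ) for teacher, occ in balanced.items()})
```

The result is a balanced data set (the same number of occasions per teacher), which is what the nested analysis described next requires.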

Design. The design selected for the analysis was a one-facet nested
design, with occasions nested within teachers. Occasions were considered
to be nested because teachers were observed at different times of day, on
different days, and teaching what may be considered different lessons.
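
For a balanced design with occasions nested within teachers, variance components are conventionally estimated from a one-way analysis of variance with teachers as the grouping factor. The sketch below assumes that standard approach; the source does not show its computations. The function name `variance_components` and the toy scores are hypothetical.

```python
from statistics import mean


def variance_components(scores):
    """Estimate variance components for occasions nested within teachers.

    scores: dict mapping a teacher label to a list of occasion scores, with
    the same number of occasions for every teacher (balanced design).
    Returns (teacher variance, error variance), where the error term lumps
    together occasion, teacher-by-occasion, and residual effects.
    """
    n_t = len(scores)
    n_o = len(next(iter(scores.values())))
    if n_t < 2 or n_o < 2:
        raise ValueError("need at least two teachers and two occasions")
    if any(len(v) != n_o for v in scores.values()):
        raise ValueError("balanced design required")

    teacher_means = {t: mean(v) for t, v in scores.items()}
    grand_mean = mean(list(teacher_means.values()))

    ss_between = n_o * sum((m - grand_mean) ** 2 for m in teacher_means.values())
    ss_within = sum(
        (x - teacher_means[t]) ** 2 for t, v in scores.items() for x in v
    )
    ms_between = ss_between / (n_t - 1)
    ms_within = ss_within / (n_t * (n_o - 1))

    # Expected mean squares: E[MS_within] = error variance,
    # E[MS_between] = error variance + n_o * teacher variance.
    var_error = ms_within
    var_teacher = max((ms_between - ms_within) / n_o, 0.0)  # truncate at zero
    return var_teacher, var_error


toy_scores = {"A": [1, 3], "B": [5, 7]}
print(variance_components(toy_scores))
```

Truncating a negative teacher-variance estimate at zero is a common convention and is also an assumption here, not something the source specifies.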

Even though raters are an implicit source of error, they were not considered a
potential source of error in this analysis for several reasons. First, all
raters had extensive training during the first year of the study and during
the summer prior to the second year of the study, enabling them to consistently
reach a 0.8 agreement. Furthermore, the criterion for agreement required
that raters achieve the 0.8 reliability not only in their coding for each
category in the observation system, but also on frequency counts within each
category. Disagreements between raters were most often a result of one rater
being able to code more information than another, and, therefore, the rank
ordering of the teachers was not affected. This implies that there was also a
minimal teacher-rater interaction; therefore, raters were considered not
to be a potential source of error affecting the generalizability of the measures.

Instrument. The instrument used to collect data was the Teacher-Child
Dyadic Interaction Observation System (Brophy and Good, 1969). This instrument
attempts to code all dyadic interactions (teacher behaviors with respect to an
individual child as well as the child's response and interactions with the
teacher) occurring in the classroom. It contains 167 variables divided into
two main categories: public response variables, in which the teacher-child
interaction occurs in a group setting; and private response variables, in which
the teacher and child confer privately about the child's individual work.
Within these two categories of variables, Brophy and Good identified clusters
of variables. The public variables included the following clusters: Teacher's
Method of Selecting Students to Respond; Difficulty Level of Questions; Type
of Questions Asked (Academic or Nonacademic); Quality of Student Response to
Questions; Teacher's Feedback Reaction to Student Responses; Student Initiated
Comments; and Student Initiated Questions. The private interaction variables
were divided into three clusters: Child Created Contacts (CCC); Teacher Afforded
Contacts (TAC); and Behavior Related Contacts.

Statistical Analysis. The effect of the occasion facet on the generalizability
of teacher-child interactions was estimated by the application of generalizability
theory (Cronbach, Gleser, Nanda, and Rajaratnam, 1972). In generalizability
theory a generalizability study (G study) has two purposes. The first is to
examine the generalizability of the measures (e.g., of teacher behavior) by
considering the potential sources of error (e.g., occasions and raters) which
affect the reliability of measurements obtained. Based on this analysis, a
G study then recommends variables for inclusion in future decision studies
(D studies) which examine, for example, relationships between teacher behaviors
and student outcomes.

For each variable examined in this study, the G study analysis provided
the estimate of the universe score (true score in classical theory) variance
[$\hat{\sigma}^2(t)$], and the estimate of the error variance, which in this
design was due to the teacher-occasion interaction confounded with the occasion
variance and unidentified sources of error [$\hat{\sigma}^2(o,to,e)$]. The
formula for obtaining the coefficient of generalizability in this design is

\[ \rho^2 = \frac{\hat{\sigma}^2(t)}{\hat{\sigma}^2(t) + \hat{\sigma}^2(o,to,e)/n} \]

where n is the number of occasions. Using this formula and based on the
estimates of the variance components, the number of occasions (n) required to
obtain a prespecified level of generalizability can be calculated for each
variable.
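
As a worked step that the text leaves implicit, the coefficient formula can be solved for n at a target level of generalizability. The numerical line uses the "Process questions" variance estimates reported in Table 1 (2.42 and 16.64) with a target of 0.7. How the fractional result is rounded to a whole number of occasions is not stated in the text.

```latex
\[
\rho^2 = \frac{\hat{\sigma}^2(t)}{\hat{\sigma}^2(t) + \hat{\sigma}^2(o,to,e)/n}
\quad\Longrightarrow\quad
n = \frac{\rho^2}{1-\rho^2}\cdot\frac{\hat{\sigma}^2(o,to,e)}{\hat{\sigma}^2(t)}
\]
\[
\rho^2 = 0.7:\qquad
n = \frac{0.7}{0.3}\cdot\frac{16.64}{2.42} \approx 16.04
\]
```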
A generalizable variable was defined in this study as one for which a
coefficient of generalizability of 0.7 could be obtained by observing the teacher
on ten or fewer observation occasions. Not only is ten a practical upper limit
on the number of observation occasions which could be used, but also, and of greater
importance, teacher behaviors which require more than ten occasions to obtain a
reliable estimate are usually inconsistent and fluctuating, suggesting a need
to redefine and/or reconceptualize these variables.
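
The ten-occasion criterion can be applied mechanically once the variance estimates are known. The sketch below is illustrative only: the function names and the `MAX_OCCASIONS` constant are introduced here, and the variance estimates are the "Type of Question" values reported in Table 1. The function returns the fractional number of occasions, because the source does not say how it rounds to a whole number.

```python
MAX_OCCASIONS = 10  # upper limit used in the study's definition
TARGET = 0.7        # required coefficient of generalizability


def required_occasions(var_teacher, var_error, target=TARGET):
    """Fractional number of occasions at which the coefficient equals target.

    Obtained by solving rho2 = var_teacher / (var_teacher + var_error / n)
    for n, which gives n = target / (1 - target) * var_error / var_teacher.
    """
    if var_teacher <= 0:
        raise ValueError("universe score variance must be positive")
    if not 0 < target < 1:
        raise ValueError("target must lie strictly between 0 and 1")
    return (target / (1.0 - target)) * (var_error / var_teacher)


def is_generalizable(var_teacher, var_error, limit=MAX_OCCASIONS):
    """True when the target can be reached within the occasion limit."""
    return required_occasions(var_teacher, var_error) <= limit


# (universe score variance, error variance) as reported in Table 1.
type_of_question = {
    "Choice questions": (162.45, 266.64),
    "Product questions": (273.78, 608.93),
    "Process questions": (2.42, 16.64),
}

for name, (var_t, var_e) in type_of_question.items():
    n = required_occasions(var_t, var_e)
    verdict = "generalizable" if is_generalizable(var_t, var_e) else "nongeneralizable"
    print(f"{name}: n = {n:.2f} ({verdict})")
```

Under these assumptions the two frequent question types fall well inside the limit, while "Process questions" exceeds it, in line with the classification reported in the Results.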

Results 

Initial inspection of the data revealed that a majority of the variables
occurred infrequently and inconsistently and were recorded for only a few teachers.
This pattern of occurrence was characteristic of all variables in three clusters
(Student-Initiated Questions, Student-Initiated Comments, and Child Created
Contacts) and two sub-clusters (Opinion Questions and Non-Academic Self Reference
Questions). Brophy and Evertson (1976) suggested in their analysis that the
classroom interactions represented by these variables may not be appropriate
for teaching fundamental tool skills such as reading and math in the 2nd and
3rd grades. The rest of the low frequency variables were scattered throughout
the remaining variable clusters. They appeared to be infrequent mainly because
of the detailed nature of the observation instrument, which attempts to allow
for all possible interactions even when their occurrence is not likely (e.g.,
praise after a wrong answer or criticism after a right answer). None of the
low frequency variables described above appeared to play any appreciable role
in primary reading instruction in the classrooms observed; they were, therefore,
eliminated from the generalizability analysis.

Another type of low frequency variable was retained for analysis. These
variables differed from those previously described in that the behaviors occurred
for at least 20% of the teachers. These variables may be important in
distinguishing between effective and ineffective teachers despite their
relatively infrequent occurrence across teachers, and their generalizability should
be examined. Those found to be generalizable should be included in correlational
studies of teacher-pupil classroom interaction and student outcomes to determine
if they are, in fact, important variables in reading instruction.

Table 1 presents the results of the analysis for the classroom interaction
variables analyzed. Variables are grouped into five clusters based on those
developed by Brophy and Good (1969). The first four clusters contain public
interactions, and the last cluster contains private interactions. Each
variable cluster is discussed separately. For each variable the table includes
the estimates of universe score variance [$\hat{\sigma}^2(t)$] and error variance
[$\hat{\sigma}^2(o,to,e)$] and the number of occasions required to reach a 0.7
level of generalizability.



INSERT TABLE 1 ABOUT HERE 



The first variable cluster, Teacher's Selection of Respondents, describes
the way in which the teacher selects students to respond to questions asked.
The teacher may either preselect (name the child who is to answer before asking
the question), select a child from among those who volunteer to answer, or
select a nonvolunteer. If a student gives the answer before the teacher has
time to select a student, this is labeled a "call-out." Relatively few
occasions are needed to obtain a reliable (generalizable) measure of the
selection of a volunteer, of a nonvolunteer, or of the frequency of call-outs
(2, 3, and 4 respectively). The last variable, "preselection of a student," is
also generalizable, but requires more occasions (9) to reach a 0.7 level of
generalizability.



The next cluster, Type of Question, contains variables related to the
type of questions asked. "Choice questions," "product questions," and "process
questions" represent difficulty levels of academic questions. To answer a
choice question, the child must select the correct answer from two or more
options given by the teacher. To answer a product question, the child must
give a specific correct answer which can be expressed in a single word or
short phrase. The process question, which is the most complex, requires the
child to explain the steps which must be followed to solve a problem or to
reach a conclusion. Two of the three variables in this cluster were found to
be generalizable. "Choice questions" and "product questions," the types
found to occur most frequently in reading instruction at these grade levels,
require four and five occasions respectively to reach a 0.7 level of
generalizability. "Process questions" is nongeneralizable, requiring 16 occasions to
reach the acceptable level of generalizability.

The third cluster, Quality of Student Response to Questions, evaluates
student answers to questions. Four variables were considered: "correct,"
"part-correct," "wrong," and "no response." All can be estimated with three or
fewer occasions, indicating that these behaviors are
highly consistent within a particular reading instruction group.

Only one variable in the Teacher Feedback Reaction to Student Responses
cluster, praise following a correct answer, occurred frequently enough to
warrant analysis. Apparently, this is the only type of feedback which occurs
regularly during reading instruction. It needs only three observation
occasions to obtain a 0.7 level of generalizability.

The last cluster, Teacher Afforded Contacts (TAC), contains private dyadic
interactions. TACs may be related to work, to procedures, or to a child's
behavior. Only a few variables in this cluster were analyzed because most
behaviors occurred infrequently. The measures of TAC variables related to
work and to management procedures were both nongeneralizable. These teachers'
behaviors, although occurring frequently, fluctuated so greatly that 13 and 18
occasions would be needed to obtain a reliable estimate of their behavior. On
the other hand, measures of interactions related to a child's behavior are
quite consistent. All measures of behavior-related contacts are generalizable,
with the number of occasions required to reach a 0.7 level of generalizability
ranging from 3 to 5.

Discussion 

The findings above indicate that a majority of the variables
analyzed can be considered generalizable if measured by the required number
of observation occasions. It should be recalled, however, that all other
Dyadic Interaction System variables not presented in the table exhibited such
low frequency counts that they were excluded from analysis. Although some
of these might be found generalizable, this generalizability statistically
could result from the fact that their frequency of occurrence tends to be
consistently zero.

The large number of infrequent teacher-child dyadic interaction variables
suggests that primary reading instruction consists of a limited range of such
behaviors. These findings, however, do not exclude the possibility that some
classroom interaction variables during reading instruction might be more
frequent and/or consistent at higher grade levels. If such
is the case, these variables should be analyzed to determine their generalizability.

Ten observation occasions were selected as the maximum number allowed to
reach a 0.7 level of generalizability in this study. The number of occasions
required to reach this level for those variables which were generalizable ranged
from 1 to 9 occasions. Past classroom observation studies, considering a range of
subject matters and grade levels, have often used three or fewer occasions to
measure teacher behaviors (Shavelson and Atwood, 1977). The present analysis
indicates that some variables require more than three occasions to be measured
reliably. It should be noted, however, that in this study interactions occurring
frequently during reading instruction may, in general, be considered highly
consistent. Almost half of the generalizable variables could be measured reliably
by the use of three observation occasions and approximately three quarters of
them by the use of five observation occasions.
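
To make the three-occasion comparison concrete, the coefficient formula from the Statistical Analysis section can be evaluated at a fixed number of occasions. The sketch below is an illustration under stated assumptions: the function name `g_coefficient` is introduced here, and the variance estimates are three rows taken from Table 1 as reported.

```python
def g_coefficient(var_teacher, var_error, n_occasions):
    """Coefficient of generalizability for a mean over n_occasions.

    Implements var_teacher / (var_teacher + var_error / n_occasions),
    the one-facet nested formula given in the Statistical Analysis section.
    """
    if n_occasions < 1:
        raise ValueError("need at least one occasion")
    if var_teacher < 0 or var_error < 0:
        raise ValueError("variance estimates must be non-negative")
    denominator = var_teacher + var_error / n_occasions
    if denominator == 0:
        raise ValueError("both variance estimates are zero")
    return var_teacher / denominator


# (universe score variance, error variance) as reported in Table 1.
examples = {
    "Praise following correct answer": (35.34, 41.34),
    "Process questions": (2.42, 16.64),
    "Work contact involving brief contact": (5.41, 30.45),
}

for name, (var_t, var_e) in examples.items():
    rho2 = g_coefficient(var_t, var_e, n_occasions=3)
    print(f"{name}: coefficient with 3 occasions = {rho2:.2f}")
```

With three occasions, a variable such as praise after a correct answer clears the 0.7 level, while the nongeneralizable variables fall far short of it, which is why a fixed three-occasion schedule is adequate for some measures and not for others.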

Classroom observation studies have frequently observed teachers teaching different
subject matters but combined the different subject matters for analysis. The Teacher
Effectiveness Study (Brophy and Evertson, 1976) coded the reading data
separately, allowing reading and non-reading class activities to be analyzed
separately. A comparison of the results of this study with those of Erlich
and Borich (1976), who analyzed the generalizability of the non-reading
activities, indicates that classroom interactions during reading and non-reading
instruction differ in several significant ways.

Reading instruction appears to be primarily a public process. With the
exception of behavior-related contacts, almost all of the private interaction
variables occurred infrequently. Non-reading class activities appeared balanced
between public and private interactions and included many more private
teacher-child interactions (both teacher afforded and child created). For example,
in Erlich and Borich's analysis, the cluster of child created contacts contained
the largest number of variables analyzed. In this study, the entire cluster
was eliminated because so few instances of child created contacts during
reading instruction were recorded.

Teachers also asked different types of questions in reading and non-reading
instruction. During non-reading activities, almost all questions asked were
"product questions." "Choice questions" appeared so infrequently that this
variable was not even analyzed. During reading instruction, however, choice
questions occurred frequently and were highly generalizable (four occasions).
Teachers appeared to find choice questions particularly suited to reading
instruction, but not to other subjects. Teacher questions were more task
oriented during reading instruction. Self-reference questions were asked
during non-reading activities, but only academic questions occurred during
reading instruction.

Teacher behaviors appeared influenced by the reading context in several
other important ways. For example, selection of a nonvolunteer during
non-reading activities was inconsistent and its measurement nongeneralizable,
while the same behavior was highly consistent and its measurement generalizable
during reading instruction. The more consistent selection of nonvolunteers
during reading suggests that the teacher is more likely to insist upon involving
the reluctant, shy, or non-assertive child during reading than during non-reading
activities. Another noteworthy difference occurred in the quality of student
responses to questions. The percentage of correct, wrong, part-correct, and
no-response answers could be estimated in three or fewer occasions during
reading instruction, while the number of occasions required during non-reading
activities was six or greater. This difference suggests that the teacher is
more consistent in gauging the difficulty level of questions during reading
instruction than during other activities. A final difference was that feedback
type reactions were far more limited during reading instruction than during
non-reading instruction. Only one feedback response, praise after a correct
response, was employed frequently enough during reading instruction to be
considered for analysis.

In summary, the findings of this study suggest that observation data for
reading instruction should be analyzed separately from data obtained during
other types of instruction. Behaviors observed during, say, math or social
studies may not occur during reading, and conversely, reading instruction may
elicit behaviors unique to that context. This study found that reading
instruction encompassed a narrower range of pupil-teacher classroom interaction
than that found during non-reading instruction in the same classrooms. Even
when the same behaviors occurred across subject matters, measures of these
behaviors may be generalizable in one context and not in the other; or the
number of occasions necessary to reach an acceptable level of generalizability
may differ. In planning future observational studies of reading instruction,
researchers should rely upon the findings of this study to ascertain the
appropriate number of observations needed to obtain generalizable measures of
teaching behavior during reading instruction.






Table 1

Estimate of Universe Score Variance and Error Variance,
and Number of Occasions Required to Reach 0.7 Level of Generalizability
for Dyadic Interaction Variables during Reading Instruction

                                                                                  Number of
Variable                                  \hat{\sigma}^2(t)  \hat{\sigma}^2(o,to,e)  Occasions

Teachers' Selection of Respondents
  Selects volunteer                            105.83             161.93              2
  Selects nonvolunteer                         258.09             381.72              3
  Call-outs by student                          10.86              19.49              4
  Preselects student                            14.68              59.74              9

Type of Question
  Choice questions                             162.45             266.64              4
  Product questions                            273.78             608.93              5
  Process questions                              2.42              16.64             16

Quality of Student Response to Questions
  Part-correct                                   5.69               3.13
  Correct                                      384.24             342.09
  Wrong                                         19.09              21.09
  No response                                    6.96              10.43

Teacher Feedback Reaction to Student Responses
  Praise following correct answer               35.34              41.34              3

Teacher Afforded Contacts
  Work contact involving brief contact           5.41              30.45             13
  Procedural management contacts                 5.55              42.80             18
  Behavioral related contacts
    Contacts involving no teacher error          8.45              11.08              3
    Contacts involving teacher warning           4.80               7.87              4
    Contacts involving teacher criticism         0.97               2.22              5